
Joint analysis of SNP and gene expression data in genetic association studies of complex diseases

Authors: Huang Yen-Tsung; Lin Xihong; VanderWeele Tyler J.
Publication venue: Institute of Mathematical Statistics
Publication date: 25/04/2014

Genetic association studies have been a popular approach for assessing the association between common Single Nucleotide Polymorphisms (SNPs) and complex diseases. However, other genomic data involved in the mechanism from SNPs to disease, for example, gene expressions, are usually neglected in these association studies. In this paper, we propose to exploit gene expression information to more powerfully test the association between SNPs and diseases by jointly modeling the relations among SNPs, gene expressions and diseases. We propose a variance component test for the total effect of SNPs and a gene expression on disease risk. We cast the test within the causal mediation analysis framework with the gene expression as a potential mediator. For eQTL SNPs, the use of gene expression information can enhance power to test for the total effect of a SNP-set, which is the combined direct and indirect effects of the SNPs mediated through the gene expression, on disease risk. We show that the test statistic under the null hypothesis follows a mixture of $\chi^2$ distributions, which can be evaluated analytically or empirically using the resampling-based perturbation method. We construct tests for each of three disease models that are determined by SNPs only, SNPs and gene expression, or include also their interactions. As the true disease model is unknown in practice, we further propose an omnibus test to accommodate different underlying disease models. We evaluate the finite sample performance of the proposed methods using simulation studies, and show that our proposed test performs well and the omnibus test can almost reach the optimal power where the disease model is known and correctly specified. We apply our method to reanalyze the overall effect of the SNP-set and expression of the ORMDL3 gene on the risk of asthma.

Comment: Published at http://dx.doi.org/10.1214/13-AOAS690 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
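As a rough illustration of what evaluating a null distribution that is a mixture of chi-square distributions can look like, the sketch below estimates a tail probability for a weighted sum of independent one-degree-of-freedom chi-square variables by plain Monte Carlo simulation. This is not the paper's analytic evaluation or its resampling-based perturbation method; the function name `mixture_chi2_pvalue`, the weights, and the observed statistic are all hypothetical and chosen only for the example.

```python
import numpy as np


def mixture_chi2_pvalue(q_obs, weights, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(sum_j w_j * X_j >= q_obs), X_j ~ chi-square(1).

    `weights` are the mixing weights (hypothetical here; in a variance
    component test they would typically come from eigenvalues of a
    matrix determined by the fitted null model).
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    # Each row is one simulated draw of the independent chi-square(1) terms.
    draws = rng.chisquare(df=1, size=(n_draws, weights.size)) @ weights
    # Fraction of simulated statistics at least as large as the observed one.
    return float(np.mean(draws >= q_obs))


# Hypothetical weights and observed statistic, for illustration only.
p_value = mixture_chi2_pvalue(q_obs=6.0, weights=[2.0, 1.0, 0.5])
print(p_value)
```

With a single weight of 1 the mixture reduces to an ordinary chi-square distribution with one degree of freedom, which gives a simple sanity check on the simulation.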

arXiv.org e-Print Archive

CiteSeerX

Semiparametric Normal Transformation Models for Spatially Correlated Survival Data

Authors: Li Yi; Lin Xihong
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/09/2005

There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right censored spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazards model with unspecified baseline hazard, and their joint distribution is obtained by transforming survival outcomes to normal random variables, whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of the class of semiparametric normal transformation models is that it provides a rich class of spatial survival models where regression coefficients have a population-average interpretation and the spatial dependence of survival times is conveniently modeled using the transformed variables by flexible normal random fields. We study the relationship between the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically prohibitive due to the high-dimensional intractable integration of the likelihood function and the infinite-dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators, and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Asthma Study and its performance is evaluated using simulations.

Collection Of Biostatistics Research Archive

Nonparametric Regression Using Local Kernel Estimating Equations for Correlated Failure Time Data

Authors: Lin Xihong; Yu Zhangsheng
Publication venue: Collection of Biostatistics Research Archive
Publication date: 30/08/2006

Collection Of Biostatistics Research Archive

Testing the Correlation for Clustered Categorical and Censored Discrete Time‐to‐Event Data When Covariates Are Measured without/with Errors

Authors: Li Yi; Lin Xihong
Publication venue: Wiley
Publication date: 01/01/2003

Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/92373/1/1541-0420.00004.pd

CiteSeerX

Deep Blue Documents at the University of Michigan

Analysis of Case-Control Age-at-Onset Data Using a Modified Case-Cohort Method

Authors: Lin Xihong; Nan Bin
Publication venue: Collection of Biostatistics Research Archive
Publication date: 20/11/2006

Case-control designs are widely used in rare disease studies. In a typical case-control study, data are collected from a sample of all available subjects who have experienced a disease (cases) and a sub-sample of subjects who have not experienced the disease (controls) in a study cohort. Cases are often oversampled in case-control studies. Logistic regression is a common tool to estimate the relative risks of the disease associated with a set of covariates. Very often in such a study, information on ages at onset of the disease for all cases and ages at survey for controls is known. Standard logistic regression analysis using age as a covariate is based on a dichotomous outcome and does not efficiently use such age-at-onset (time-to-event) information. We propose to analyze age-at-onset data using a modified case-cohort method by treating the control group as an approximation and show that the asymptotic bias of the proposed estimator is small when the disease rate is low. We evaluate the finite sample performance of the proposed method through a simulation study and illustrate the method using a breast cancer case-control data set.

CiteSeerX

Collection Of Biostatistics Research Archive

The Effect of Correlation in False Discovery Rate Estimation

Authors: Lin Xihong; Schwartzman Armin
Publication venue: Collection of Biostatistics Research Archive
Publication date: 06/07/2009

Collection Of Biostatistics Research Archive